Rule-based Measurement Of
Abstract
Sufficiently high data quality is crucial for almost every application, yet data quality issues are nearly omnipresent. Poor quality cannot simply be blamed on software defects or insufficiently implemented business processes. In our experience, the main reason is that data quality tends to converge down to a level that is inherent to the existing applications. As soon as applications and data are used for tasks other than those they were originally designed for, problems arise. In this paper we extend and evaluate an approach that uses association rules to measure the accuracy dimension of data quality. The rules are used to build a model intended to capture normality. This model is then employed to divide the database records into three subsets: “potentially incorrect”, “no decision”, and “probably correct”. We thoroughly evaluate the approach on data from our automotive domain. The results it achieves in identifying incorrect data entries are very promising. In the described setting, it was possible for the first time to highlight a significant number of incorrect data records that would otherwise disappear among the millions of correct records. This ability enables domain experts to understand what is going wrong and how to improve data quality. Moreover, our approach is a first step towards automatically quantifying the overall accuracy of a yet unseen dataset.
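The abstract does not give the details of the rule model, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes nominal records stored as dictionaries, mines simple one-to-one association rules, and sorts a record into one of the three subsets by summing the confidences of the rules it violates or satisfies. The function names `mine_rules` and `classify`, the scoring scheme, and all thresholds are hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import permutations


def mine_rules(records, min_support=0.2, min_confidence=0.9):
    """Mine rules of the form (attr_a = val_a) -> (attr_b = val_b).

    Assumption: only single-antecedent rules are considered, which is
    far simpler than a full association rule miner.
    """
    n = len(records)
    item_counts = Counter()
    pair_counts = Counter()
    for rec in records:
        items = list(rec.items())
        item_counts.update(items)
        # Ordered pairs of items from one record: (antecedent, consequent).
        pair_counts.update(permutations(items, 2))

    rules = []
    for (antecedent, consequent), count in pair_counts.items():
        support = count / n
        confidence = count / item_counts[antecedent]
        if support >= min_support and confidence >= min_confidence:
            rules.append((antecedent, consequent, confidence))
    return rules


def classify(record, rules, threshold=0.9):
    """Assign a record to one of the three subsets named in the abstract.

    Assumption: the score is the summed confidence of violated or
    satisfied rules, compared against a hypothetical threshold.
    """
    violated = 0.0
    satisfied = 0.0
    for (a_attr, a_val), (c_attr, c_val), confidence in rules:
        # A rule applies only if its antecedent matches the record.
        if record.get(a_attr) != a_val or c_attr not in record:
            continue
        if record[c_attr] == c_val:
            satisfied += confidence
        else:
            violated += confidence

    if violated >= threshold:
        return "potentially incorrect"
    if satisfied >= threshold:
        return "probably correct"
    return "no decision"


# Toy data (invented for illustration): model "A" almost always has a diesel engine.
training = [{"model": "A", "engine": "diesel"}] * 19 + [{"model": "A", "engine": "petrol"}]
rules = mine_rules(training)
for rec in ({"model": "A", "engine": "diesel"},
            {"model": "A", "engine": "petrol"},
            {"model": "C", "engine": "petrol"}):
    print(rec, "->", classify(rec, rules))
```

In this sketch a record falls into "no decision" whenever no mined rule applies to it or the evidence stays below the threshold, which mirrors the three-way split described in the abstract without claiming anything about how the paper itself draws the boundaries.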
Similar resources
Reliability Measures Measurement under Rule-Based Fuzzy Logic Technique
In reliability theory, the reliability measures contend the very important and depreciative role for any system analysis. Measurement of reliability measures is not easy due to ambiguity and vagueness which exist within reliability parameters. It is also very difficult to incorporate a large amount of uncertainty in well-established methodologies and techniques. However, fuzzy logic provides an...
S3PSO: Students’ Performance Prediction Based on Particle Swarm Optimization
Nowadays, new methods are required to take advantage of the rich and extensive gold mine of data given the vast content of data particularly created by educational systems. Data mining algorithms have been used in educational systems especially e-learning systems due to the broad usage of these systems. Providing a model to predict final student results in educational course is a reason for usi...
A rule-based evaluation of ladder logic diagram and timed petri nets for programmable logic controllers
This paper describes an evaluation through a case study by measuring a rule-based approach, which proposed for ladder logic diagrams and Petri nets. In the beginning, programmable logic controllers were widely designed by ladder logic diagrams. When complexity and functionality of manufacturing systems increases, developing their software is becoming more difficult. Thus, Petri nets as a high l...
Determining the Accuracy of Sonography and Naegele's Rule in Estimating the Delivery Date
The Accuracy Determination of the Naegele’s Rule and Sonography for Estimating the Delivery Date R. Dehghani Firouzabadi MD , T. Botorabi , N. Tayebi GP Received: 23/09/06 Sent for Revision: 27/02/07 Received Revised Manuscript: 09/07/07 Accepted: 08/09/07 Background and Objective: Estimation of the gestational age (G.A) and the estimated date of confinement (EDC) are of paramount important fac...
A hybrid BSC-DEMATEL- FIS approach for performance measurement in Food Industry
Organizational performance is a complex issue given that performance is a multifaceted phenomenon whose components may have distinct managerial priorities and may even be mutually inconsistent. Recently, the balanced scorecard approach (BSC), as an effective multi-criteria evaluation concept received much attention in organizational performance measurement. Although the BSC conceptual framework...
Rule-based of Monetary Policy in Iran Inspired by McCallum Rule
Economists have reached a consensus that an independent central bank could improve its policy efficiency by following a monetary policy rule. One of the important rules is McCallum rule where that requires central banks to target the growth rate of nominal GDP using the monetary base as its instrument. One of the features of the McCallum rule uses the monetary base rather than the interest rate...
Publication date: 2007